Predicting plant protein subcellular multi-localization by Chou's PseAAC formulation based multi-label homolog knowledge transfer learning

J Theor Biol. 2012 Oct 7;310:80-7. doi: 10.1016/j.jtbi.2012.06.028. Epub 2012 Jun 27.

Abstract

Recent years have witnessed much progress in computational modeling for protein subcellular localization. However, there are far fewer computational models for predicting plant protein subcellular multi-localization. In this paper, we propose a multi-label multi-kernel transfer learning model (MLMK-TLM) for predicting multiple subcellular locations of plant proteins. The method introduces a multi-label confusion matrix and adapts one-against-all multi-class probabilistic outputs to the multi-label learning scenario; on this basis, we extend our previously published MK-TLM (multi-kernel transfer learning based on Chou's PseAAC formulation for protein submitochondria localization) to plant protein subcellular multi-localization. With proper homolog knowledge transfer, MLMK-TLM is applicable to novel plant proteins in the multi-label learning scenario. Experiments on a plant protein benchmark dataset show that MLMK-TLM outperforms the baseline model. Unlike existing models, MLMK-TLM also reports its misleading tendency, which is important for a comprehensive survey of a model's multi-labeling performance.
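
The abstract does not spell out how the one-against-all probabilistic outputs are turned into multiple labels, or how the multi-label confusion matrix is defined. The following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a fixed probability threshold (with a fallback to the single most probable location) and a confusion-matrix convention in which each missed true location is charged against each wrongly predicted location. The function names `predict_labels` and `multilabel_confusion`, the threshold value, and the example locations are all hypothetical.

```python
from typing import Dict, List, Set


def predict_labels(probs: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Turn per-location one-against-all probabilities into a label set.

    Assumption (not from the paper): keep every location whose probability
    reaches `threshold`; if none does, keep the single most probable one.
    """
    chosen = {loc for loc, p in probs.items() if p >= threshold}
    if not chosen:
        chosen = {max(probs, key=probs.get)}
    return chosen


def multilabel_confusion(
    true_sets: List[Set[str]], pred_sets: List[Set[str]], labels: List[str]
) -> List[List[int]]:
    """Build a label-by-label confusion matrix (rows: true, columns: predicted).

    Assumed convention: a correctly predicted location adds 1 to the diagonal;
    each missed true location adds 1 to the column of every wrongly predicted
    location for the same protein. Off-diagonal cells then hint at which
    locations the model tends to mistake for which.
    """
    index = {loc: i for i, loc in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(true_sets, pred_sets):
        for loc in truth & pred:
            matrix[index[loc]][index[loc]] += 1
        for missed in truth - pred:
            for wrong in pred - truth:
                matrix[index[missed]][index[wrong]] += 1
    return matrix


# Illustrative data only; these are not results from the paper.
labels = ["chloroplast", "mitochondrion", "nucleus"]
true_sets = [{"chloroplast", "mitochondrion"}, {"chloroplast"}]
pred_sets = [
    predict_labels({"chloroplast": 0.8, "mitochondrion": 0.6, "nucleus": 0.1}),
    predict_labels({"chloroplast": 0.2, "mitochondrion": 0.3, "nucleus": 0.4}),
]
confusion = multilabel_confusion(true_sets, pred_sets, labels)
```

Under this convention, the off-diagonal entry in the chloroplast row and nucleus column records that a chloroplast protein was labeled as nuclear, which is the kind of information a "misleading tendency" report could summarize.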

MeSH terms

  • Artificial Intelligence*
  • Computational Biology / methods*
  • Databases, Protein
  • Plant Proteins / metabolism*
  • Protein Transport
  • Sequence Homology, Amino Acid*
  • Software*
  • Subcellular Fractions / metabolism

Substances

  • Plant Proteins